On geometric properties of enumerations of axis-parallel rectangles
We show that for any set of non-overlapping axis-parallel rectangles in the plane, there exists a sloping enumeration, that is, one in which the numbers of the rectangles intersected by any line with a non-negative slope increase along this line. Such an enumeration can be computed in the optimal time Θ(n log n) using linear space. The notion of a sloping enumeration can be generalized to higher dimensions; however, already in three-dimensional space it may not exist. We also consider a strip packing problem for a set of rectangles with a fixed enumeration, which is required to be sloping for the resulting packing. This problem is proved to be NP-hard in any dimension d ≥ 2. Russian Foundation for Basic Researc
De Novo Sequencing of Peptides from High-Resolution Bottom-Up Tandem Mass Spectra using Top-Down Intended Methods
Although high-resolution mass spectrometers are becoming accessible to more and more laboratories, tandem (MS/MS) mass spectra are still often collected at a low resolution. Even when spectra are acquired at a high resolution, the software tools used to process them rarely take full advantage of it: the ability to specify a relative mass tolerance often remains the only feature the respective algorithms exploit. We argue that high-resolution MS/MS spectra are analyzed more efficiently by methods that account for the precision level more explicitly, and support this claim by demonstrating that a de novo sequencing framework originally developed for (high-resolution) top-down MS/MS data is perfectly suitable for processing high-resolution bottom-up datasets, even though a top-down-like deconvolution performed as the first step will leave in many spectra at most a few peaks
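As a rough illustration of what a relative mass tolerance means in practice, the sketch below compares an observed m/z value with a theoretical one using a tolerance in parts per million (ppm). The function names, the default of 10 ppm, and the peak values are assumptions made for this example only; they are not taken from the paper or from any particular tool.

```python
def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """Return True if the two m/z values agree within a relative tolerance.

    The allowed absolute error grows with the mass: tol_ppm parts per million
    of the theoretical value.
    """
    allowed_error = theoretical_mz * tol_ppm * 1e-6
    return abs(observed_mz - theoretical_mz) <= allowed_error


def match_peaks(observed, theoretical, tol_ppm=10.0):
    """Pair every theoretical m/z with each observed peak within tolerance.

    Hypothetical helper: a plain double loop, adequate for short peak lists.
    """
    return [
        (t, o)
        for t in theoretical
        for o in observed
        if within_ppm(o, t, tol_ppm)
    ]


if __name__ == "__main__":
    # The same absolute error of 0.005 passes at m/z 1000 but fails at m/z 100.
    print(within_ppm(1000.005, 1000.0, 10.0))
    print(within_ppm(100.005, 100.0, 10.0))
    print(match_peaks([100.005, 1000.005], [100.0, 1000.0]))
```

With a relative tolerance the acceptance window widens with the mass, so a single ppm setting applies across the whole m/z range of a high-resolution spectrum.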
Top-down analysis of protein samples by de novo sequencing techniques
Motivation: Recent technological advances have made high-resolution mass spectrometers affordable to many laboratories, thus boosting rapid development of top-down mass spectrometry and creating a need for efficient methods for analyzing this kind of data.
Results: We describe a method for analysis of protein samples from top-down tandem mass spectrometry data, which capitalizes on de novo sequencing of fragments of the proteins present in the sample. Our algorithm takes as input a set of de novo amino acid strings derived from the given mass spectra using the recently proposed Twister approach, and combines them into aggregated strings endowed with offsets. The former typically constitute accurate sequence fragments of sufficiently well-represented proteins from the sample being analyzed, while the latter indicate their location in the protein sequence, and also bear information on post-translational modifications and fragmentation patterns.
Availability and Implementation: Freely available on the web at http://bioinf.spbau.ru/en/twister
De Novo Sequencing of Top-Down Tandem Mass Spectra: A Next Step towards Retrieving a Complete Protein Sequence
De novo sequencing of tandem (MS/MS) mass spectra represents the only way to determine the sequence of proteins from organisms with unknown genomes, or of proteins not directly inscribed in a genome, such as antibodies or novel splice variants. Top-down mass spectrometry provides new opportunities for analyzing such proteins; however, retrieving a complete protein sequence from top-down MS/MS spectra still remains a distant goal. In this paper, we review the state of the art on this subject, and enhance our previously developed Twister algorithm for de novo sequencing of peptides from top-down MS/MS spectra to derive longer sequence fragments of a target protein
Validation of De Novo Peptide Sequences with Bottom-Up Tag Convolution
De novo sequencing is indispensable for the analysis of proteins from organisms with unknown genomes, novel splice variants, and antibodies. However, despite a variety of methods developed to this end, distinguishing between the correct interpretation of a mass spectrum and a number of incorrect alternatives often remains a challenge. Tag convolution is computed for a set of peptide sequence tags of a fixed length k generated from the input tandem mass spectra and can be viewed as a generalization of the well-known spectral convolution. We demonstrate its utility for validating de novo peptide sequences by using a set of those generated by the algorithm PepNovo+ from high-resolution bottom-up data sets for carbonic anhydrase 2 and the Fab region of alemtuzumab and indicate its further potential applications
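For readers unfamiliar with the spectral convolution that the abstract above builds on, here is a minimal sketch of that classical operation only: it counts how often each mass difference occurs between the peaks of two spectra. It does not implement tag convolution itself, whose details the abstract does not give. The function name, the rounding to two decimals, and the toy peak lists are assumptions made for illustration.

```python
from collections import Counter


def spectral_convolution(spectrum_a, spectrum_b, decimals=2):
    """Count pairwise mass differences b - a between two peak lists.

    Differences are rounded to `decimals` places so that nearly equal
    offsets fall into the same bin (an assumption of this sketch).
    """
    counts = Counter()
    for a in spectrum_a:
        for b in spectrum_b:
            counts[round(b - a, decimals)] += 1
    return counts


if __name__ == "__main__":
    # Toy data: every peak of the second spectrum is shifted by 16.0.
    first = [100.0, 200.0, 300.0]
    second = [116.0, 216.0, 316.0]
    convolution = spectral_convolution(first, second)
    print(convolution.most_common(1))  # [(16.0, 3)]
```

An offset with a high count suggests that the two spectra share a series of peaks shifted by that mass, which is the kind of signal a convolution over tags would generalize.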
De Novo Sequencing of Peptides from Top-Down Tandem Mass Spectra
De novo sequencing of proteins and peptides is one of the most important problems in mass spectrometry-driven proteomics. A variety of methods have been developed to accomplish this task from a set of bottom-up tandem (MS/MS) mass spectra. However, a more recently emerged top-down technology, now gaining more and more popularity, opens new perspectives for protein analysis and characterization, implying a need for efficient algorithms to process this kind of MS/MS data. Here, we describe a method that allows for the retrieval, from a set of top-down MS/MS spectra, of long and accurate sequence fragments of the proteins contained in the sample. To this end, we outline a strategy for generating high-quality sequence tags from top-down spectra, and introduce the concept of a T-Bruijn graph by adapting to the case of tags the notion of an A-Bruijn graph widely used in genomics. The output of the proposed approach represents the set of amino acid strings spelled out by optimal paths in the connected components of a T-Bruijn graph. We illustrate its performance on top-down data sets acquired from carbonic anhydrase 2 (CAH2) and the Fab region of alemtuzumab
De Novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra
There are two approaches to de novo protein sequencing: Edman degradation and mass spectrometry (MS). Existing MS-based methods characterize a novel protein by assembling tandem mass spectra of overlapping peptides generated from multiple proteolytic digestions of the protein. Because each tandem mass spectrum covers only a short peptide of the target protein, the key to high-coverage protein sequencing is to find spectral pairs from overlapping peptides in order to assemble tandem mass spectra into longer ones. However, overlapping regions of peptides may be too short to be confidently identified. High-resolution mass spectrometers have become accessible to many laboratories. These mass spectrometers are capable of analyzing molecules of large mass values, boosting the development of top-down MS. Top-down tandem mass spectra cover whole proteins. However, top-down tandem mass spectra, even combined, rarely provide full ion fragmentation coverage of a protein. We propose an algorithm, TBNovo, for de novo protein sequencing by combining top-down and bottom-up MS. In TBNovo, a top-down tandem mass spectrum is utilized as a scaffold, and bottom-up tandem mass spectra are aligned to the scaffold to increase sequence coverage. Experiments on data sets of two proteins showed that TBNovo achieved high sequence coverage and high sequence accuracy